Policy Search for Path Integral Control

Authors

Vicenç Gómez

Hilbert J. Kappen

Jan Peters

Gerhard Neumann

Abstract

Path integral (PI) control defines a general class of control problems for which computing the optimal control is equivalent to an inference problem that can be solved by evaluating a path integral over state trajectories. However, this potential is mostly unused in real-world problems because of two main limitations: first, current approaches can typically only be applied to learn open-loop controllers, and second, current sampling procedures are inefficient and do not scale to high-dimensional systems. We introduce the efficient Path Integral Relative-Entropy Policy Search (PI-REPS) algorithm for learning feedback policies with PI control. Our algorithm is inspired by the information-theoretic policy updates that are often used in policy search. We use these updates to approximate the state trajectory distribution that is known to be optimal from PI control theory. Our approach allows for a principled treatment of different sampling distributions and can be used to estimate many types of parametric or non-parametric feedback controllers. We show that PI-REPS significantly outperforms current methods and is able to solve tasks that are out of reach for them.
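
As a rough illustration of the two ingredients the abstract combines, the sketch below reweights sampled trajectories by their exponentiated negative cost (the usual form of the optimal trajectory distribution in PI control) and picks the temperature so that the reweighted sample distribution stays within a relative-entropy (KL) bound of the uniform sample distribution. It is not the PI-REPS algorithm itself; the function names, the bisection search, and the toy costs are assumptions made for illustration only.

```python
import math


def path_integral_weights(costs, temperature):
    """Normalized weights w_i proportional to exp(-cost_i / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    min_cost = min(costs)  # subtract the minimum for numerical stability
    unnormalized = [math.exp(-(c - min_cost) / temperature) for c in costs]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def kl_from_uniform(weights):
    """KL divergence between the weighted samples and uniform weights."""
    n = len(weights)
    return sum(w * math.log(w * n) for w in weights if w > 0.0)


def temperature_for_kl_bound(costs, epsilon, low=1e-6, high=1e6, iterations=200):
    """Bisection on a log scale for the temperature that gives KL = epsilon.

    Assumes the costs are not all equal and that epsilon is smaller than
    the largest KL the samples can reach (at most log(len(costs))).
    """
    for _ in range(iterations):
        mid = math.sqrt(low * high)
        if kl_from_uniform(path_integral_weights(costs, mid)) > epsilon:
            low = mid   # weights too peaked: a higher temperature is needed
        else:
            high = mid  # weights too flat: a lower temperature is needed
    return math.sqrt(low * high)


# Hypothetical accumulated costs of four sampled trajectories.
costs = [1.0, 2.0, 4.0, 8.0]
epsilon = 0.5
temperature = temperature_for_kl_bound(costs, epsilon)
weights = path_integral_weights(costs, temperature)
```

A small `epsilon` keeps the weights close to uniform, so each update stays near the sampling distribution; a large `epsilon` concentrates the weight on the lowest-cost trajectories. In a feedback-policy setting, weights of this kind would typically feed a weighted regression from states to controls, which this sketch does not attempt.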

Similar resources

Policy Search for Imitation Learning

Efficient motion planning and possibilities for non-experts to teach new motion primitives are key components for a new generation of robotic systems. In order to be applicable beyond the well-defined context of laboratories and the fixed settings of industrial factories, those machines have to be easily programmable, adapt to dynamic environments and learn and acquire new skills autonomously. ...

Numerical solution of higher index DAEs using their IAE's structure: Trajectory-prescribed path control problem and simple pendulum

In this paper, we solve higher index differential algebraic equations (DAEs) by transforming them into integral algebraic equations (IAEs). We apply collocation methods on the space of continuous piecewise polynomials to solve the resulting higher index IAEs. The efficiency of the given method is improved by using a recursive formula for computing the integral part. Finally, we apply the obtained algo...

Path Integral Stochastic Optimal Control for Reinforcement Learning

Path integral stochastic optimal control based learning methods are among the most efficient and scalable reinforcement learning algorithms. In this work, we present a variation of this idea in which the optimal control policy is approximated through linear regression. This connection allows the use of well-developed linear regression algorithms for learning of the optimal policy, e.g. learning...

An Iterative Path Integral Stochastic Optimal Control Approach for Learning Robotic Tasks

Recent work on path integral stochastic optimal control theory (Theodorou et al., 2010a; Theodorou, 2011) has shown promising results in planning and control of nonlinear systems in high dimensional state spaces. The path integral control framework relies on the transformation of the nonlinear Hamilton Jacobi Bellman (HJB) partial differential equation (PDE) into a linear PDE and the approximat...

Acceleration of Gradient-based Path Integral Method for Efficient Optimal and Inverse Optimal Control

This paper deals with a new accelerated path integral method, which iteratively searches optimal controls with a small number of iterations. This study is based on the recent observations that a path integral method for reinforcement learning can be interpreted as gradient descent. This observation also applies to an iterative path integral method for optimal control, which sets a convincing ar...

Publication date: 2014
